1 CGACCTGGCC GCCGGCCGCT CCTCCGCGCG CTGTTCCGCA CTTGCTGCCC
51 TCGCCCGGCC CGGAGCGCCG CTGCCATGCG GCTGGCGCTG CTCTGGGCCC
101 TGGGGCTCCT GGGCGCGGGC AGCCCTCTGC CTTCCTGGCC GCTCCCAAAT
151 ATAGCCCTGC TGTCGATTCC CTCAGTACTG TCTTGGGGTG TCCTGGGACC
201 TGCAGGTGGC ACTGAGGAGC AGCAGGGAGA GTCAGAGAAG GCCCCGAGGG
251 AGCCCTTGGA GCCCCAGGTC CTTCAGGACG ATCTCCCAAT TAGCCTCAAA
301 AAGGTGCTTC AGACCAGTCT GCCTGAGCCC CTGAGGATCA AGTTGGAGCT
351 GGACGGTGAC AGTCATATCC TGGAGCTGCT ACAGAATAGG GAGTTGGTCC
401 CAGGCCGCCC AACCCTGGTG TGGTACCAGC CCGATGGCAC TCGGGTGGTC
451 AGTGAGGGAC ACACTTTGGA GAACTGCTGC TACCAGGGAA GAGTGCGGGG
501 ATATGCAGGC TCCTGGGTGT CCATCTGCAC CTGCTCTGGG CTCAGAGGCT
551 TGGTGGTCCT GACCCCAGAG AGAAGCTATA CCCTGGAGCA GGGGCCTGGG
601 GACCTTCAGG GTCCTCCCAT TATTTCGCGA ATCCAAGATC TCCACCTGCC
651 AGGCCACACC TGTGCCCTGA GCTGGCGGGA ATCTGTACAC ACTCAGACGC
701 CACCAGAGCA CCCCCTGGGA CAGCGCCACA TTCGCCGGAG GCGGGATGTG
751 GTAACAGAGA CCAAGACTGT GGAGTTGGTG ATTGTGGCTG ATCACTCGGA
801 GGCCCAGAAA TACCGGGACT TCCAGCACCT GCTAAACCGC ACACTGGAAG
851 TGGCCCTCTT GCTGGACACA TTCTTCCGGC CCCTGAATGT ACGAGTGGCA
901 CTAGTGGGCC TGGAGGCCTG GACCCAGCGT GACCTGGTGG AGATCAGCCC
951 AAACCCAGCT GTCACCCTCG AAAACTTCCT CCACTGGCGC AGGGCACATT
1001 TGCTGCCTCG ATTGCCCCAT GACAGTGCCC AGCTGGTGAC TGGTACTTCA
1051 TTCTCTGGGC CTACGGTGGG CATGGCCATT CAGAACTCCA TCTGTTCTCC
1101 TGACTTCTCA GGAGGTGTGA ACATGGACCA CTCCACCAGC ATCCTGGGAG
1151 TCGCCTCCTC CATAGCCCAT GAGTTGGGCC ACAGCCTGGG CCTGGACCAT
1201 GATTTGCCTG GGAATAGCTG CCCCTGTCCA GGTCCAGCCC CAGCCAAGAC
1251 CTGCATCATG GAGGCCTCCA CAGACTTCCT ACCAGGCCTG AACTTCAGCA
1301 ACTGCAGCCG ACGGGCCCTG GAGAAAGCCC TCCTGGATGG AATGGGCAGC
1351 TGCCTCTTCG AACGGCTGCC TAGCCTACCC CCTATGGCTG CTTTCTGCGG
1401 AAATATGTTT GTGGAGCCGG GCGAGCAGTG TGACTGTGGC TTCCTGGATG
1451 ACTGCGTCGA TCCCTGCTGT GATTCTTTGA CCTGCCAGCT GAGGCCAGGT
1501 GCACAGTGTG CATCTGACGG ACCCTGTTGT CAAAATTGCC AGCTGCGCCC
1551 GTCTGGCTGG CAGTGTCGTC CTACCAGAGG GGATTGTGAC TTGCCTGAAT
1601 TCTGCCCAGG AGACAGCTCC CAGTGTCCCC CTGATGTCAG CCTAGGGGAT
1651 GGCGAGCCCT GCGCTGGCGG GCAAGCTGTG TGCATGCACG GGCGTTGTGC
1701 CTCCTATGCC CAGCAGTGCC AGTCACTTTG GGGACCTGGA GCCCAGCCCG
1751 CTGCGCCACT TTGCCTCCAG ACAGCTAATA CTCGGGGAAA TGCTTTTGGG
1801 AGCTGTGGGC GCAACCCCAG TGGCAGTTAT GTGTCCTGCA CCCCTAGAGA
1851 TGCCATTTGT GGGCAGCTCC AGTGCCAGAC AGGTAGGACC CAGCCTCTGC
1901 TGGGCTCCAT CCGGGATCTA CTCTGGGAGA CAATAGATGT GAATGGGACT
1951 GAGCTGAACT GCAGCTGGGT GCACCTGGAC CTGGGCAGTG ATGTGGCCCA
2001 GCCCCTCCTG ACTCTGCCTG GCACAGCCTG TGGCCCTGGC CTGGTGTGTA
2051 TAGACCATCG ATGCCAGCGT GTGGATCTCC TGGGGGCACA GGAATGTCGA
2101 AGCAAATGCC ATGGACATGG GGTCTGTGAC AGCAACAGGC ACTGCTACTG
2151 TGAGGAGGGC TGGGCACCCC CTGACTGCAC CACTCAGCTC AAAGCAACCA
2201 GCTCCCTGAC CACAGGGCTG CTCCTCAGCC TCCTGGTCTT ATTGGTCCTG
2251 GTGATGCTTG GTGCCAGCTA CTGGTACCGT GCCCGCCTGC ACCAGCGACT
2301 CTGCCAGCTC AAGGGACCCA CCTGCCAGTA CAGGGCAGCC CAATCTGGTC
2351 CCTCTGAACG GCCAGGACCT CCGCAGAGGG CCCTGCTGGC ACGAGGCACT 
2401 AAGGCTAGTG CTCTCAGCTT CCCGGCCCCC CCTTCCAGGC CGCTGCCGCC 
2451 TGACCCTGTG TCCAAGAGAC TCCAGTCTCA GGGGCCAGCC AAGCCCCCAC 
2501 CCCCAAGGAA GCCACTGCCT GCCGACCCCC AGGGCCGGTG CCCATCGGGT 
2551 GACCTGCCCG GCCCAGGGGC TGGAATCCCG CCCCTAGTGG TACCCTCCAG 
2601 ACCAGCGCCA CCGCCTCCGA CAGTGTCCTC GCTCTACCTC TGACCTCTCC 
2651 GGAGGTTCCG CTGCCTCCAA GCCGGACTTA GGGCTTCAAG AGGCGGGCGT 
2701 GCCCTCTGGA GTCCCCTACC ATGACTGAAG GCGCCAGAGA CTGGCGGTGT 
2751 CTTAAGACTC CGGGCACCGC CACGCGCTGT CAAGCAACAC TCTGCGGACC 
2801 TGCCGGCGTA GTTGCAGCGG GGGCTTGGGG AGGGGCTGGG GGTTGGACGG 
2851 GATTGAGGAA GGTCCGCACA GCCTGTCTCT GCTCAGTTGC AATAAACGTG 
2901 ACATCTTGGA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
2951 AAAAAAAAAA AAAAAAAA 
(SEQ ID NO: 1) 




FEATURES: 

5'UTR: 1-75

Start Codon: 76 

Stop Codon: 2641 

3'UTR: 2644 
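
The feature coordinates above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function name `cds_summary` is hypothetical, and it assumes the listed positions are 1-based and that "Stop Codon: 2641" gives the first base of the stop codon.

```python
# Feature coordinates as listed for SEQ ID NO: 1 (assumed 1-based).
START_CODON = 76    # first base of the ATG
STOP_CODON = 2641   # first base of the stop codon
UTR3_START = 2644   # first base of the 3'UTR


def cds_summary(start: int, stop: int) -> dict:
    """Return coding length in nucleotides and codons, excluding the stop codon."""
    coding_nt = stop - start
    if coding_nt % 3 != 0:
        raise ValueError("start and stop codons are not in the same reading frame")
    return {
        "utr5_nt": start - 1,        # bases before the start codon
        "coding_nt": coding_nt,      # bases from ATG up to, not including, the stop
        "codons": coding_nt // 3,    # number of encoded residues
    }


summary = cds_summary(START_CODON, STOP_CODON)
print(summary)
```

Under these assumptions the coding region spans 2565 nucleotides, i.e. 855 codons, which agrees with the 855-residue protein of SEQ ID NO: 2, and the 3'UTR begins immediately after the three-base stop codon.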

Homologous proteins: 

Top 10 BLAST Hits: 

Sequences producing significant alignments:                        Score   E
                                                                   (bits)  value

CRA|335001098640323 /altid=gi|7451525 /def=pir||G02390 disinteg..   1714   0.0
CRA|335001098639998 /altid=gi|11497002 /def=ref|NP_003806.2| a ..   1698   0.0
CRA|1000682348196 /altid=gi|9945328 /def=ref|NP_064704.1| a dis..   1377   0.0
CRA|18000005154484 /altid=gi|6752962 /def=ref|NP_033744.1| a di..   1351   0.0
CRA|1000737073449 /altid=gi|6682839 /def=dbj|BAA88903.1| (AB022..   1319   0.0
CRA|157000140328366 /altid=gi|12720142 /def=ref|XP_010635.1| a ..    970   0.0
CRA|18000005119563 /altid=gi|4501905 /def=ref|NP_003465.1| a di..    539   e-152
CRA|98000043629034 /altid=gi|13027660 /def=gb|AAC08702.2| (AF02..    539   e-152
CRA|18000005009258 /altid=gi|6680640 /def=ref|NP_031426.1| a di..    538   e-151
CRA|98000043606871 /altid=gi|12802370 /def=gb|AAK07852.1|AF3113..    517   e-145


EST:

Sequences producing significant alignments:      Score   E
                                                 (bits)  value

gi|12777372 /dataset=dbest /taxon=960.            1750   0.0
gi|10205626 /dataset=dbest /taxon=96..            1364   0.0
gi|10746030 /dataset=dbest /taxon=96..            1352   0.0
gi|12758166 /dataset=dbest /taxon=960.            1334   0.0
gi|13130161 /dataset=dbest /taxon=960.            1306   0.0
gi|11003698 /dataset=dbest /taxon=96..            1298   0.0
gi|12763891 /dataset=dbest /taxon=960.            1281   0.0
gi|9124688 /dataset=dbest /taxon=9606.            1211   0.0

EXPRESSION INFORMATION FOR MODULATORY USE:
gi|12777372 placenta
gi|10205626 lung
gi|10746030 ovary
gi|12758166 colon
gi|13130161 kidney
gi|11003698 thyroid gland
gi|12763891 prostate
gi|9124688 eye

Tissue expression:
leucocyte

1 MRLALLWALG LLGAGSPLPS WPLPNIALLS IPSVLSWGVL GPAGGTEEQQ
51 AESEKAPREP LEPQVLQDDL PISLKKVLQT SLPEPLRIKL ELDGDSHILE
101 LLQNRELVPG RPTLVWYQPD GTRVVSEGHT LENCCYQGRV RGYAGSWVSI
151 CTCSGLRGLV VLTPERSYTL EQGPGDLQGP PIISRIQDLH LPGHTCALSW
201 RESVHTQTPP EHPLGQRHIR RRRDVVTETK TVELVIVADH SEAQKYRDFQ
251 HLLNRTLEVA LLLDTFFRPL NVRVALVGLE AWTQRDLVEI SPNPAVTLEN
301 FLHWRRAHLL PRLPHDSAQL VTGTSFSGPT VGMAIQNSIC SPDFSGGVNM
351 DHSTSILGVA SSIAHELGHS LGLDHDLPGN SCPCPGPAPA KTCIMEASTD
401 FLPGLNFSNC SRRALEKALL DGMGSCLFER LPSLPPMAAF CGNMFVEPGE
451 QCDCGFLDDC VDPCCDSLTC QLRPGAQCAS DGPCCQNCQL RPSGWQCRPT
501 RGDCDLPEFC PGDSSQCPPD VSLGDGEPCA GGQAVCMHGR CASYAQQCQS
551 LWGPGAQPAA PLCLQTANTR GNAFGSCGRN PSGSYVSCTP RDAICGQLQC
601 QTGRTQPLLG SIRDLLWETI DVNGTELNCS WVHLDLGSDV AQPLLTLPGT
651 ACGPGLVCID HRCQRVDLLG AQECRSKCHG HGVCDSNRHC YCEEGWAPPD
701 CTTQLKATSS LTTGLLLSLL VLLVLVMLGA SYWYRARLHQ RLCQLKGPTC
751 QYRAAQSGPS ERPGPPQRAL LARGTKASAL SFPAPPSRPL PPDPVSKRLQ
801 SQGPAKPPPP RKPLPADPQG RCPSGDLPGP GAGIPPLVVP SRPAPPPPTV
851 SSLYL
(SEQ ID NO: 2)

FEATURES:

Functional domains and key regions:

[1] PDOC00001 PS00001 ASN_GLYCOSYLATION
N-glycosylation site

Number of matches: 5

1 254-257 NRTL

2 406-409 NFSN

3 409-412 NCSR

4 623-626 NGTE

5 628-631 NCSW



[2] PDOC00005 PS00005 PKC_PHOSPHO_SITE
Protein kinase C phosphorylation site

Number of matches: 11

1 53-55 SEK

2 73-75 SLK

3 199-201 SWR

4 283-285 TQR

5 411-413 SRR

6 589-591 TPR

7 602-604 TGR

8 611-613 SIR

9 686-688 SNR

10 760-762 SER

11 796-798 SKR



[3] PDOC00006 PS00006 CK2_PHOSPHO_SITE
Casein kinase II phosphorylation site 

Number of matches: 8 

1 81-84 SLPE 

2 199-202 SWRE 

3 208-211 TPPE 

4 283-286 TQRD 

5 500-503 TRGD 

6 522-525 SLGD 

7 589-592 TPRD 

8 611-614 SIRD 
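
Site lists of this kind can be reproduced by scanning the protein with the published PROSITE patterns rewritten as regular expressions. The following is a minimal sketch rather than the tool that produced the lists: the helper name `scan` and the `offset` parameter are assumptions made for illustration, and the two fragments are short excerpts of SEQ ID NO: 2.

```python
import re

# PROSITE patterns rewritten as regular expressions.
PATTERNS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",  # PS00001: N-{P}-[ST]-{P}
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",       # PS00005: [ST]-x-[RK]
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",    # PS00006: [ST]-x(2)-[DE]
}


def scan(seq: str, pattern: str, offset: int = 1):
    """Return (start, end, match) tuples with 1-based positions.

    A lookahead is used so that overlapping sites are all reported.
    `offset` is the sequence position of the first character of `seq`.
    """
    hits = []
    for m in re.finditer(f"(?=({pattern}))", seq):
        start = m.start() + offset
        hits.append((start, start + len(m.group(1)) - 1, m.group(1)))
    return hits


fragment_401 = "FLPGLNFSNCSRRALEKALL"  # residues 401-420 of SEQ ID NO: 2
fragment_81 = "SLPEPLRIKL"             # residues 81-90 of SEQ ID NO: 2

glyco_hits = scan(fragment_401, PATTERNS["ASN_GLYCOSYLATION"], offset=401)
pkc_hits = scan(fragment_401, PATTERNS["PKC_PHOSPHO_SITE"], offset=401)
ck2_hits = scan(fragment_81, PATTERNS["CK2_PHOSPHO_SITE"], offset=81)
print(glyco_hits, pkc_hits, ck2_hits)
```

The lookahead matters here because the two N-glycosylation sites at 406-409 and 409-412 share residue 409; a plain non-overlapping search would report only the first.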



[4] PDOC00008 PS00008 MYRISTYL
N-myristoylation site

Number of matches: 18

1 10-15 GLLGAG

2 145-150 GSWVSI

3 323-328 GTSFSG

4 358-363 GVASSI

5 404-409 GLNFSN

6 422-427 GMGSCL

7 475-480 GAQCAS

8 532-537 GQAVCM

9 555-560 GAQPAA

10 571-576 GNAFGS

11 583-588 GSYVSC

12 596-601 GQLQCQ

13 624-629 GTELNC

14 637-642 GSDVAQ

15 670-675 GAQECR

16 682-687 GVCDSN

17 714-719 GLLLSL

18 774-779 GTKASA



[5] PDOC00016 PS00016 RGD 
Cell attachment sequence 



501-503 RGD 



[6] PDOC00021 PS01186 EGF_2 
EGF-like domain signature 2 

690-701 CYCEEGWAPPDC 



[7] PDOC00129 PS00142 ZINC_PROTEASE 

Neutral zinc metallopeptidases, zinc-binding region signature

362-371 SIAHELGHSL 

Membrane spanning structure and domains:

Helix  Begin  End   Score  Certainty
1      25     45    1.602  Certain
2      144    164   0.925  Putative
3      317    337   1.237  Certain
4      430    450   0.768  Putative
5      547    567   0.601  Putative
6      640    660   1.243  Certain
7      711    731   2.394  Certain



BLAST Alignment to Top Hit:
Alignment to top blast hit:

>CRA|335001098640323 /altid=gi|7451525 /def=pir||G02390 disintegrin
and metalloproteinase MDC15 (EC 3.4.24.-) - human
/org=human /taxon=9606 /dataset=nraa /length=814

Length = 814

Score = 1714 bits (4390), Expect = 0.0
Identities = 812/855 (94%), Positives = 812/855 (94%)
Frame = +1 
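
The summary figures in this header are mutually consistent, which can be checked with a few lines of arithmetic. This is a hedged sketch: the function name `percent_truncated` is hypothetical, and it assumes the subject is aligned over its full length (1-814) and that the translated query contains no gap columns.

```python
def percent_truncated(matches: int, columns: int) -> int:
    """Integer percentage, truncated rather than rounded."""
    return (100 * matches) // columns


IDENTITIES = 812        # identical columns reported in the header
ALIGNED_COLUMNS = 855   # alignment length reported in the header
SUBJECT_LENGTH = 814    # length of the top hit

identity_pct = percent_truncated(IDENTITIES, ALIGNED_COLUMNS)

# Columns where the query has a residue but the subject has a gap
# (assumes the whole subject is aligned and the query has no gaps).
gap_columns = ALIGNED_COLUMNS - SUBJECT_LENGTH

# Whatever is neither identical nor a gap must be a substitution.
mismatches = ALIGNED_COLUMNS - IDENTITIES - gap_columns

print(identity_pct, gap_columns, mismatches)
```

With these inputs the truncated identity is 94%, matching the header, and the remaining 43 non-identical columns split into 41 gap positions in the subject and 2 substitutions.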

Query: 76   MRLALLWALGLLGAGSPLPSWPLPNIALLSIPSVLSWGVLGPAGGTEEQQAESEKAPREP 255
            MRLALLWALGLLGAGSPLPSWPLPNI                 GGTEEQQAESEKAPREP
Sbjct: 1    MRLALLWALGLLGAGSPLPSWPLPNI-----------------GGTEEQQAESEKAPREP 43

Query: 256  LEPQVLQDDLPISLKKVLQTSLPEPLRIKLELDGDSHILELLQNRELVPGRPTLVWYQPD 435
            LEPQVLQDDLPISLKKVLQTSLPEPLRIKLELDGDSHILELLQNRELVPGRPTLVWYQPD
Sbjct: 44   LEPQVLQDDLPISLKKVLQTSLPEPLRIKLELDGDSHILELLQNRELVPGRPTLVWYQPD 103

Query: 436  GTRVVSEGHTLENCCYQGRVRGYAGSWVSICTCSGLRGLVVLTPERSYTLEQGPGDLQGP 615
            GTRVVSEGHTLENCCYQGRVRGYAGSWVSICTCSGLRGLVVLTPERSYTLEQGPGDLQGP
Sbjct: 104  GTRVVSEGHTLENCCYQGRVRGYAGSWVSICTCSGLRGLVVLTPERSYTLEQGPGDLQGP 163

Query: 616  PIISRIQDLHLPGHTCALSWRESVHTQTPPEHPLGQRHIRRRRDVVTETKTVELVIVADH 795
            PIISRIQDLHLPGHTCALSWRESVHTQTPPEHPLGQRHIRRRRDVVTETKTVELVIVADH
Sbjct: 164  PIISRIQDLHLPGHTCALSWRESVHTQTPPEHPLGQRHIRRRRDVVTETKTVELVIVADH 223

Query: 796  SEAQKYRDFQHLLNRTLEVALLLDTFFRPLNVRVALVGLEAWTQRDLVEISPNPAVTLEN 975
            SEAQKYRDFQHLLNRTLEVALLLDTFFRPLNVRVALVGLEAWTQRDLVEISPNPAVTLEN
Sbjct: 224  SEAQKYRDFQHLLNRTLEVALLLDTFFRPLNVRVALVGLEAWTQRDLVEISPNPAVTLEN 283

Query: 976  FLHWRRAHLLPRLPHDSAQLVTGTSFSGPTVGMAIQNSICSPDFSGGVNMDHSTSILGVA 1155
            FLHWRRAHLLPRLPHDSAQLVTGTSFSGPTVGMAIQNSICSPDFSGGVNMDHSTSILGVA
Sbjct: 284  FLHWRRAHLLPRLPHDSAQLVTGTSFSGPTVGMAIQNSICSPDFSGGVNMDHSTSILGVA 343

Query: 1156 SSIAHELGHSLGLDHDLPGNSCPCPGPAPAKTCIMEASTDFLPGLNFSNCSRRALEKALL 1335
            SSIAHELGHSLGLDHDLPGNSCPCPGPAPAKTCIMEASTDFLPGLNFSNCSRRALEKALL
Sbjct: 344  SSIAHELGHSLGLDHDLPGNSCPCPGPAPAKTCIMEASTDFLPGLNFSNCSRRALEKALL 403

Query: 1336 DGMGSCLFERLPSLPPMAAFCGNMFVEPGEQCDCGFLDDCVDPCCDSLTCQLRPGAQCAS 1515
            DGMGSCLFERLPSLPPMAAFCGNMFVEPGEQCDCGFLDDCVDPCCDSLTCQLRPGAQCAS
Sbjct: 404  DGMGSCLFERLPSLPPMAAFCGNMFVEPGEQCDCGFLDDCVDPCCDSLTCQLRPGAQCAS 463

Query: 1516 DGPCCQNCQLRPSGWQCRPTRGDCDLPEFCPGDSSQCPPDVSLGDGEPCAGGQAVCMHGR 1695
            DGPCCQNCQLRPSGWQCRPTRGDCDLPEFCPGDSSQCPPDVSLGDGEPCAGGQAVCMHGR
Sbjct: 464  DGPCCQNCQLRPSGWQCRPTRGDCDLPEFCPGDSSQCPPDVSLGDGEPCAGGQAVCMHGR 523

Query: 1696 CASYAQQCQSLWGPGAQPAAPLCLQTANTRGNAFGSCGRNPSGSYVSCTPRDAICGQLQC 1875
            CASYAQQCQSLWGPGAQPAAPLCLQTANTRGNAFGSCGRNPSGSYVSCTPRDAICGQLQC
Sbjct: 524  CASYAQQCQSLWGPGAQPAAPLCLQTANTRGNAFGSCGRNPSGSYVSCTPRDAICGQLQC 583

Query: 1876 QTGRTQPLLGSIRDLLWETIDVNGTELNCSWVHLDLGSDVAQPLLTLPGTACGPGLVCID 2055
            QTGRTQPLLGSIRDLLWETIDVNGTELNCSWVHLDLGSDVAQPLLTLPGTACGPGLVCID
Sbjct: 584  QTGRTQPLLGSIRDLLWETIDVNGTELNCSWVHLDLGSDVAQPLLTLPGTACGPGLVCID 643

Query: 2056 HRCQRVDLLGAQECRSKCHGHGVCDSNRHCYCEEGWAPPDCTTQLKATSSLTTGLLLSLL 2235
            HRCQRVDLLGAQECRSKCHGHGVCDSNRHCYCEEGWAPPDCTTQLKATSSLTTGLLLSLL
Sbjct: 644  HRCQRVDLLGAQECRSKCHGHGVCDSNRHCYCEEGWAPPDCTTQLKATSSLTTGLLLSLL 703

Query: 2236 VLLVLVMLGASYWYRARLHQRLCQLKGPTCQYRAAQSGPSERPGPPQRALLARGTKASAL 2415
            VLLVLVMLGASYWYRARL QRLCQLKGPTCQYRAAQSGPSERPGPPQRALLARGTK
Sbjct: 704  VLLVLVMLGASYWYRARLKQRLCQLKGPTCQYRAAQSGPSERPGPPQRALLARGTK---- 759

Query: 2416 SFPAPPSRPLPPDPVSKRLQSQGPAKPPPPRKPLPADPQGRCPSGDLPGPGAGIPPLVVP 2595
                                SQGPAKPPPPRKPLPADPQGRCPSGDLPGPG GIPPLVVP
Sbjct: 760  --------------------SQGPAKPPPPRKPLPADPQGRCPSGDLPGPGPGIPPLVVP 799

Query: 2596 SRPAPPPPTVSSLYL 2640
            SRPAPPPPTVSSLYL
Sbjct: 800  SRPAPPPPTVSSLYL 814 (SEQ ID NO:4)

Hmmer search results (Pfam): 

Scores for sequence family classification (score includes all domains): 



Model     Description                                    Score    E-value   N
PF01421   Reprolysin (M12B) family zinc metalloproteas   259.3    5.3e-74   1
PF01562   Reprolysin family propeptide                   128.4    2.1e-35   1
PF00200   Disintegrin                                     70.0    3.4e-22   1
CE00385   E00385 platelet_aggregation_activation_inhib    26.5    5.4e-06   1
PF00035   Double-stranded RNA binding motif                7.2    1.2       1
CE00423   E00423 stromelysin_1                             4.5    0.99      1
PF01400   Astacin (Peptidase family M12A)                  2.6    7.8       1


Parsed for domains: 

Model     Domain  seq-f  seq-t     hmm-f  hmm-t      score    E-value
PF01562   1/1       100    217 ..      1    119 []   128.4    2.1e-35
PF01400   1/1       363    373 ..     91    101 ..     2.6    7.8
CE00423   1/1       364    375 ..    222    233 ..     4.5    0.99
PF01421   1/1       230    428 ..      1    200 [.   259.3    5.3e-74
CE00385   1/1       447    518 ..      1     67 [.    26.5    5.4e-06
PF00200   1/1       447    523 ..      1     76 []    70.0    3.4e-22
PF00035   1/1       734    766 ..      1     37 [.     7.2    1.2

1 TTGGGTGACC CTGGGCAGTG ATCACATCTC CAAGCATCAG TTTTCTCACC
51 TGAAAAAAAG GAGATGATAA TAACACTATC TGCCTTACAT GACAATTGAA
101 TTGAATTTTT TTTTTTTTTT TGAGACTAAG TCTCACTCTG TCGCCCAGGC
151 TGGAGTGCAG TGGCGTGATC TTGGCTCACT GCAACCTCCA CCTCCCCAGT 
201 TCAAGCGATT CTCGTGCCTC AGCTTCCCGA GTAGCTGGGA TTACAGGCAC 
251 ACACTACCAC GCCCGGCTAA TTTAGAATTG AAATAATTTA TGTACAGTAT 
301 CTTAGTACAG GACCTGACAT TATAAACAAT GAGTGGCAGC CATTCTTATT 
351 TAATCAGTCC TAACAAAGTT CATAAAAGTG AGACTGTGTT TGCTTAGCTT 
401 TTTCCCTAGG GCCTGGATAC CCCCAGCCCC CATGACACAC AATAGGGGCC 
451 AAATGAATGT GTTGTGAAAA AATGAAAAAC AAAAAACAAA AAAGAACATG 
501 CTGGGATTCC TTGACAGGGT CGTGAAGCAA ACTGAATGTG AATGCACAGA 
551 TGGAAATGTG CCAGACAGTC ATTCCAAGCA GAATGTGCAA AGACTCAGTC 
601 CACAGGGAAT GCGAAGTGCC AGGGCTAGTC TCAGGAGAAA CTGGCTCAGA 
651 AGAGACAGCT CTCAGGGAGG GCTAAAGTAG GAAAGAGGCT AGAAAGGGAC 
701 CAGGTGAGGG AAGGCTCTGA AGGCCAAGCC CAAGAGTTCT GCCTGTCTGG 
751 CAGGCAGCAG GGCCTCTGGA GTTTCTTGGG CAAAGAGTGG CTGCTTCCTG 
801 GGTAAGGTGG CCTGTGGAAA ATCCCTGACA ACTGTGTAGA GACATGTCGT 
851 GAGGGATGGC AGGGAGCATA GTGAACTAGG TTTGTGGTTT GGAATCAGGG 
901 CCCCTGGGGT CCAGCCAAGT TGGATTGTTT ACTATCTGTG TGACTTTGAG 
951 AGTCACTTCA CCTTTCTCAA CTGTAAAGTG GGGATAGCAA CAGTGATAGT 
1001 CGATCTGGCC TGCTCACTTC TCAGCCTCAC TGTGAGAACC AAATAAGATG 
1051 ATTTACAGGA AAGTGCAAAT GAGAGTTGTG GCTGATATCC GCTTGGAGAG 
1101 AGCCTGGAGG GTGCATCCTC CCATTCTCCA TCACAGAGTT GGGGAGGGAG 
1151 GCACCCTCGC CCTCCAGGGG TTTCCTTTGT CCAACCCAGC CTCCTCCAAC 
1201 ACGCGGGAAT TGTCAGGCCT GGCGACTTCA GACAGGAAAC GCTGTCCAGT 
1251 TCCCCTTCTT TCCCGCCTCG CTCCCGGGCT GGCGCTAACG CCCACCTCCC 
1301 AACAGCGCCA CCCGCTGGCG GATATCCTGC ACCGCGGCTG CCCGCTCCTG 
1351 CGCCGCTGGC TGTGCCGGCG CTGCGTGGTG TGCCAGGCAC CCGAGACGCC 
1401 CGAGTCCTAC GTGTGCCGGA CGCTGGACTG CGAGGCCGTG TACTGCTGGT 
1451 CGTGCTGGGA CGACATGCGG CAGCGGTGCC CGGTCTGCAC GCCCCGCGAA 
1501 GAGCTCTCTT CCTCCGCCTT TAGTGACAGC AACGACGACA CTGCCTACGC 
1551 GGGGTGAAGA GGCGTCCTGC TCGCTCTTCC GCACCGTCCT TCCCGGTTAA 
1601 TAAAATGCCC TGTACGCTTC ACGTGGGTCG GGGACTGGGG TGAGCCGCGC 
1651 ACTGCCTCGC CTGCAGTCGG GAAAGCCTGC CCGCCCGACC TCTCCGAGCC 
1701 AGGCCGCGCA CAGGAGGCAG GGAGGCCGCG AAGCTACTAG GGAGGGGTCC 
1751 GGACCTGGCG CCGGGTGAAG GCGCGCCGCC CAAGCCGGTC GGACCGGGCA 
1801 CCGGCTCCCA CTCCGCACAG TTGCGGGGAA GCGGTAGCGC TGAGCAGCGC 
1851 GGGCGTAGTG GGCGGTGTCC CCGCTCCCGA GGCACCCGGC GCGCAGCGGG 
1901 GCGGGCTTTG CCGGGGGCGG AGCTTGGCTT GGGGCCGGGT GGGAGGGGGC 
1951 GGGCCGGGGC GGGGCCTGGT GGCCGCGCGG CGCTGCTGGG TTCTCCGAGG 
2001 CGACCTGGCC GCCGGCCGCT CCTCCGCGCG CTGTTCCGCA CTTGCTGCCC 
2051 TCGCCCGGCC CGGAGCGCCG CTGCCATGCG GCTGGCGCTG CTCTGGGCCC 
2101 TGGGGCTCCT GGGCGCGGGC AGCCCTCTGC CTTCCTGGCC GCTCCCAAAT 
2151 ATAGGTGAGT CCTCCGCCTG GAGTGGGTCG GGGGGCGGAC TGGGAGGGAG 
2201 GTGCAGGAAA GTCGGAAGGC ATTAGGGTAA TGGGGCCGGA CGGAGACCCT 
2251 GGGAGAGCCC AGCCAGAGCG CGGCCCGCCC TGGTCCGCTG TCCTGGGCCT 
2301 AGGGCCCGGT GACTTGGCGA TGGGGTGAAA AGAGAAGGAG GGGGGATGCC
2351 GGCGCCCCCT GCCTCCTGCC TGGTCATCCT CTGCGCGGTC CCTGCGGACA 
2401 CTTTCAGGCT CAGGTACCAG GTACCGAGGG GCCTGTCCAG CGCCACTTCA 
2451 AGATCGTGAT GAGAGGGTCG CTGCTCCCCA GGACTGGCAT CTTCGCTGCT 
2501 CTGGGGCCTA GCTAACCGTT CCACCCGGTG CCAGGGCGCT GAGCGGGCAT 
2551 GGCTTGTAGG GTTTAGTGAA GAGGATTCTC TCTAGCCTCT ATTCCAGGCC 
2601 TGGGGCCGCC AGGCACTCCT CACCCTGGTG CTGTTGCCAC CAGTGCCTGG 
2651 CCGAGCGGGA GGGGCCCGAG ATGAGCCAGG AGAAGGGAGA ATTGGCCAGG 
2701 AAAGAGGCTG GGACACCAAC TCCTCCTTGG AACTTTCACT TCCCGCTGCT 
2751 GTCTTGGGCC GGGACCGAGA GGGCAGGCGC GGGTGGAGTG TCCGGAGGAG 
2801 AGAGGGCCAT TGTGTGTTGG GGGGGTGGGG GGTGCTCGAG GAGGAAGCAG 
2851 AGGCTGTAGG CAGCGGGTGT GCCTGACTGG GCATGAGGGT GTTTAGGGAG 
2901 GTGGGGGTGT TTGCACTGCT CACCCAGAAA TGGGCGTTCC TGGCATCTCC 
2951 GATGTGAGCG AAGGGGAGGG TGAGCGGGCA CCCGGCCACA AGGCTTAGCT 
3001 CAGTCTCGAG AGGGGGCGTT CCTGAAGTGG GGGGAGAGTG ATTGGGAGGG 
3051 AGTGGGAACC GCGGAGGGTC CTGTGAGAAC CTGGGATTGG CCGGAAGGGG 
3101 ACAAGGAGGG CCACAGGCTG CGCAAGCCGA AAGTCTTTCT TGGGGACTTG 

3151 TGAATGGGTT GGTGGGTGGA AAGCCATAAA TTAGAGAGAC ACCCTCTCCT
3201 TCCAGTATTC TTCTTTAAGT CTCAGCATGC AATGTGGAAG CCCCTCAGGT
3251 ACCTAAGGGT CTTGATGGGC TGGGAGCTGG TGGATCTGAG GGCACCTGTC
3301 ACCCCCAGCC CTGCTGTCGA TTCCCTCAGT ACTGTCTTGG GGTGTCCTGG
3351 GACCTGCAGG TGGCACTGAG GAGCAGCAGG CAGAGTCAGA GAAGGCCCCG
3401 AGGGAGCCCT TGGAGCCCCA GGTCCTTCAG GACGATCTCC CAATTAGCCT
3451 CAAAAAGGTG CTTCAGGTGA GCTCTCACTC CCCTCTAATA AATAAACGAA
3501 TCCACACACG CCCCGGTATA GCCAGGTGTC TCAAAGCCAA AGCTTGGCTG
3551 AGGAGCTGGT GGGTAGAGCT CACTGTAGTG GGTCTATCCC AGGCCCAGCT
3601 GCCTCTCCCA CCACACCCCA GCACCTGGCT TCACTTATCT CCCTCTCCCT
3651 CTGCACACAC GTGTATCTGT CTGCCTCAGC CCCACCCAAC CCATCCATCT
3701 CCACTGGGGA AATTGTGAAG CCAAACTTGC TTTCTTCATC TCATGTTGTC
3751 GGTTTTCTCA GTGGGGGGAT TTGGAAAGAG TCAGGACCTT ACCAAACCCC
3801 CCCCCCCCAC CCCATTCTAA AGCTGAGTCA GAGGAAGGGC TGGGGCTTGT 
3851 GCTGGGTCCT ACACGGTGCT TCCTCTCTGG GCAGGAAGCC GAGAAGGGGT 
3901 GGCTCAGATA CCTTCCTTGA CCTCCGCACA CAACCCCCCA GAACAATGCT 
3951 CCAGGCCAGG CAGGGTTTCC TGGCCCCTCC CCTGGGATCC CCCCACCAGT 
4001 GATCTAATTG CTGGTGCTCT TCTGTGGGCC TGAGGTTTTC TGGTTAGAGA
4051 GGCTGGGAGT TGTGGACAGG TCTAGGGAGG TGACCTGCCC TCTGGTGCCC 
4101 ACAGACCAGT CTGCCTGAGC CCCTGAGGAT CAAGTTGGAG CTGGACGGTG 
4151 ACAGTCATAT CCTGGAGCTG CTACAGAATA GGTAATAGTG ATGGTGGCAA 
4201 TAACAGTGAC CACATGGCCA ACAACTTGTA TAGCATTTAT TATGTGCCAG 
4251 GTACTAAGTG CTTGTGCTCA TTTAATCCTC ATAACAGCCC TATAAGGGAT 
4301 ATACTATCAT GTATTATTGT CCTCACTTTA TACATGAGGA AGTCAAGGCA 
4351 CAGAGAGATT AAATAACTTG CCCCAGGTCA CACAGCTAGT ATGTGGTGAA 
4401 AACCAGATTG GAATTCAAAT AAACTAACAG AGTCAGTGGC CCAACCAGTA 
4451 TACTTTGCTG CCCCGGGGTC AGGAGTGGAA AAGTTGGCTG CGGGGGTTGC 
4501 CTGGTCCCCA GCCCCACAAC CACCTTCAAG CCTCTGCTTG TCAATGCACC 
4551 GACCCTGGGA AGTGGCTTTA GCACTGCCTT CTTTTTCTTC ACTTCACAGG
4601 GGAGTTGGTC CCATGTCCGC CCCGACCCTT GGGGTCCGGC WCCCCTCT
4651 CCCCCOTCG GCGCCGCCCC TTCCCTTTTC TTTCTTCCCC TCCGCTTTCG
4701 TCCTTTTGCC TCCCCCGTGC CGTTGCGCGT TCCTTCTTCC CCGTTCCCTC 
4751 TCCCCTCTTT TGTTCCCTCC CGTTCTTTTC TCCCCCGCGT TCTTTCCTCC 
4801 TCCTTTTCGG TCCGCCCTCG CCTTCCTCCC TTCCCCTTCT GCCCTTCGCC 
4851 NTTTCTCCCT CTCGTTCTTC CTCGGTGTCG CGTCGTCCCG GCTCGGCCTT 
4901 TCCCCGCTTC CTCCCGCTCG CCGTTTTTTT CCCCCCGCTG TCTTCCCGTG
4951 TTCCCCTTCG CTTCTCCTCT TCCCTTTCGT TCGGTCGTTT TCTCGTTCCA 
5001 TTCCCGCCTC CCCGTTTCCG TTCCACTCCT TCTTCCTCCT TTCCCGCTCC 
5051 CCGTTTCTCC CGACCCCAAC AACAAATAAA NNNNNNNNNN NNNNNNNNNN 
5101 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
5151 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
5201 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
5251 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
5301 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
5351 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
5401 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
5451 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNTCAGG 
5501 AGGCCGAGTG GAAGAATCGC TTGAGCCCAG GTAGGCAGAG GTTTCAGTGG 
5551 GCCGAGATCG AGCCACTACA CACCAGCCTG GGTGAAAGAG TGAGACCTCG 
5601 TCTCAAAAAA TAAAATAAAA ATAAAATAAA ATAAAATCTA GCTGAGACAG 
5651 ATTAGGTGGT TTGCCCGAGG CCCTACAACT AATAAATGGC CTATCCATTT 
5701 ATTAGTTGTA TTTGGCTCTT CATCTGTCTT ATGATCCCAT TTGCAGAGAG 
5751 CTCTCACTTG GTTATAGATA ATACATAGTT ACCAATGATG AAGCAATATA 
5801 AACCCAATTT CCTAATTTGT AAAATGAAGA TAATAAAACT ACTTGCTGCA 
5851 TAGAGTTGCT GGGAAGATTA AATAAGTCCA TATAGATGTA AAGTGCTTAA 
5901 AACTATGCCA GACCTATGGT AAGTGACAAG AGTTGTTATT GGGATTTTTA
5951 AAATTATTAT TATTATTATT ATTATTATTT GAGACAGAGT CTCGCTCTGT 
6001 CTCCCAGGCT GGAGTGCAGT GGCGTGATCT CGGCTCACTG CAAGCTCCGC 
6051 CTCCCAGGTT CACGCCATTC TCTTGCCTCA GCCTCCCGAG TAGCTGGGAC 
6101 TACAGGCGCC CGCCACTACA CCCGGCTAAT GTTTTGTATT TTTTAGTACA 
6151 GACAGGGTTT CACCGTGTTA TCCAGGATGG TCTCGATCTC CTGACCTCAT 
6201 GATCCACCCG CCTTGTCCTC CCAAAGTGCT GAGATTACAG GCGTGAGCCA 
6251 CCGCACCCAG CTAAATTACT GTTTTTTAAA AATTTGAAAA AAACCACTGA
6301 GTTTGGAGCC AGAAAAGCAG GGGTCTACTC CAACCTTCAT TATCTACTTC 
6351 CTGGTCCTCC TTGGCAAGTT CCTGGGCCCT CTGGCCTTCA GTGGCTCATC 
6401 TGTAAAATGG GCTCTTCACC CTCCTATTTG ACCCACAGAG TAGGAGTGGC 
6451 TGCCTCTTGG TCAGCCCGGC ACAGCTGCTG GCTGCGAGCG GCAGGTTTGC 
6501 CTGATAATTC TTCTTGTCCA TAGTAGAGGC GGGATGTGGT AACAGAGACC 
6551 AAGACTGTGG AGTTGGTGAT TGTGGCTGAT CACTCGGAGG TGAGCCTGCT 
6601 GGCCCCTGCA CATCCTCCTC CCCCTGCACT GCCCTGCCGC CTTTCATGTC 
6651 ACCTCTCTTG GCCTACAGGC CCAGAAATAC CGGGACTTCC AGCACCTGCT 
6701 AAACCGCACA CTGGAAGTGG CCCTCTTGCT GGACACAGTG AGTGCTGGAC
6751 AGGGCAACCC CCACCCCAGG CCCCTGACCA TGGCAACCCC TCTTCTGAGC 
6801 CCCAGCTGTC TTTCAGTTCT TCCGGCCCCT GAATGTACGA GTGGCACTAG 
6851 TGGGCCTGGA GGCCTGGACC CAGCGTGACC TGGTGGAGAT CAGCCCAAAC 
6901 CCAGCTGTCA CCCTCGAAAA CTTCCTCCAC TGGCGCAGGG CACATTTGCT
6951 GCCTCGATTG CCCCATGACA GTGCCCAGCT GGTGACGTAA GGGCCCCAGA 
7001 CTCAGCCAGA GAGGCCAGTC CTGTCCTGGC CAAATTCACA CCCCTTCAGC 
7051 ACCCTACCTC AGCCCCTGAA GCTCTGACCA CCGTGGCTTC TGGCCCTGAA 
7101 CTTTAGCCTC TCTGTCCCAC AGTGGTACTT CATTCTCTGG GCCTACGGTG 
7151 GGCATGGCCA TTCAGAACTC CATCTGTTCT CCTGACTTCT CAGGAGGTGT 
7201 GAACATGGTG AGTTATTTCC AGGTCTCCTC CTCATTCCCA ATTCAGTTCC 
7251 TCCCAAGTGT GGTGGCATTT ATGCACTGAA ACCCCCCTAT AAAGTTGCCC 
7301 AACCCCAAAG CTACAGGTAT AGAGGGTGGA GGTACGTGAT GTGGCCTTTG 
7351 CTATCAGGGA GCCCTCGCTT ATGGCCAGCT AGTCACAGTG TACACAGTCA 
7401 TCCCCTGTGC AGTCTTCCCA TTTCTTAGAG GAGGGTAGGA GGCAGCTAAG 
7451 GCCCAAAGAA CAGAGGTGAT CTCCCTCCAG TGAGGGAGGG GGACAGAGCT 
7501 GAGCTAGAAC CCAAGTTTCT GCCATCCAGG CCTGGGTTCT CCTACTTTAG 
7551 AAGCAATTCA GGAGGGAAGC AGTGCCTGCT GAGTGCCCAC GAGGTCAGAC 
7601 GTGGAGGGAA CAGGAGCAGA GAGGGTGGTC TGGGCATTGT GGTGGAGGCA 
7651 GGCTGGGACT GGACCTACAG TACCCCTCCC CAATGACAGG ACCACTCCAC 
7701 CAGCATCCTG GGAGTCGCCT CCTCCATAGC CCATGAGTTG GGCCACAGCC 
7751 TGGGCCTGGA CCATGATTTG CCTGGGAATA GCTGCCCCTG TCCAGGTCCA 
7801 GCCCCAGCCA AGACCTGCAT CATGGAGGCC TCCACAGAGT AAGTAGCTGC 
7851 AGGATGGAGA GAGGGTGTGG GGCAGGGGGC AGGGANNNNN NNNNNNNNNN 
7901 NNNNNNNNNN TGTTAGAGTT ACCTTCCTTG CCACCCTCCC CAGCTTCCTA 
7951 CCAGGCCTGA ACTTCAGCAA CTGCAGCCGA CGGGCCCTGG AGAAAGCCCT 
8001 CCTGGATGGA ATGGGCAGCT GCCTCTTCGA ACGGCTGCCT AGCCTACCCC 
8051 CTATGGCTGC TTTCTGCGGA AATATGTTTG TGGAGCCGGG CGAGCAGTGT 
8101 GACTGTGGCT TCCTGGATGT GAGCCCCTTT CCCAAAGCCT CGCCCCACTC 
8151 ACTTCTGTAC CCTCACCCTG GCTCATTAGC CCTATCCCAG CCTCCTGAGC 
8201 TCTTGGGTTC TGAAGGGACT TTCCACCCCT CTCCTACTTG CCCTGTCTGT 
8251 GGGGACAGCA CATGGGTTGT TGGGCTCTAG CCCTCGCTTG CTGTGTAGCT 
8301 TCTGGTCTTG GCCTGTGGGA GGAGGAGAGA TTGGAGGGAG GCTCACAGGC 
8351 CCCACCTGCT CTGATGCCCG GCCCCCGTGC TCCTGCCCAC AGGACTGCGT 
8401 CGATCCCTGC TGTGATTCTT TGACCTGCCA GCTGAGGCCA GGTGCACAGT 
8451 GTGCATCTGA CGGACCCTGT TGTCAAAATT GCCAGGTGGG TAGAGACTAG 
8501 ACTGGCCACC CGGAGCTCAC CTGCCGGGGC CAAGGTGGAA AGGGTCATTC 
8551 TGACCCCCGG CTGGATTTGC TCAGTGCCCA CACTGATGCT CATCCACCCT 
8601 CCACAGCTGC GCCCGTCTGG CTGGCAGTGT CGTCCTACCA GAGGGGATTG 
8651 TGACTTGCCT GAATTCTGCC CAGGAGACAG CTCCCAGTGT CCCCCTGATG 
8701 TCAGCCTAGG GGATGGCGAG CCCTGCGCTG GCGGGCAAGC TGTGTGCATG 
8751 CACGGGCGTT GTGCCTCCTA TGCCCAGCAG TGCCAGTCAC TTTGGGGACC 
8801 TGGAGCCCAG CCCGCTGCGC CACTTTGCCT CCAGACCGCT AATACTCGGG 
8851 GAAATGCTTT TGGGAGCTGT GGGCGCAACC CCAGTGGCAG TTATGTGTCC 
8901 TGCACCCCTA GGTAAGTGAG GAAACCTGGC TCCTCCTTTG GGTTTCTGAG 
8951 AGCCTTGGCC CTGCTCCTAC TAACTCTGTG TGCCCTTCCC CCTCNNNNNN 
9001 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNTTACGG 
9051 CATTTGTAGT TACTCACACT TTTGCCTTCA NACAGCTAAT ACTCGGGGAA 
9101 ATGCTTTTGG GAGCTGTGGG CGCAACCCCA GTGGCAGTTA TGTGTCCTGC 
9151 ACCCCTAGGT AAGTGAGGAA ACCTGGCTCC TCCTTTGGGT TTCTGAGAGC 
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11501 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

11551 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

11601 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

11651 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

11701 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

11751 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

11801 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

11851 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

11901 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

11951 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

12001 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

12051 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

12101 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

12151 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

12201 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

12251 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

12301 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

j;* 12351 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

12401 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

u 12451 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

7: 12501 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

0 12551 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

□ 12601 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

i"* 12651 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

12701 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

|,'* 12751 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

12801 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

P 12851 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

p 12901 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

12951 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13001 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13051 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13101 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13151 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13201 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13251 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13301 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13351 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13401 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13451 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13501 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13551 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13601 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13651 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13701 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 

13751 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
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13801 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
13851 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
13901 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
13951 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14001 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14051 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14101 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14151 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14201 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14251 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14301 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14351 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14401 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14451 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14501 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14551 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14601 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14651 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14701 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
14751 NNNNNNNNNN NNNNNNNNNN NNNNNNTTTT TGAAAGCXAC TAGTAGGTCA 
14801 CCATTTTTTC TTGTCTTCCC GCAATCCAGA CCAGCGCCAC CGCCTCCGAC
14851 ACTGTCCTCG CTCTACCTCT GACCTCTCCG GAGGTTCCGC TGCCTCCAAG 
14901 CCGGACTTAG GGCTTCAAGA GGCGGGCGTG CCCTCTGGAG TCCCCTACCA 
14951 TGACTGAAGG CGCCAGAGAC TGGCGGTGTC TTAAGACTCC GGGCACCGCC 
15001 ACGCGCTGTC AAGCAACACT CTGCGGACCT GCCGGCGTAG TTGCAGCGGG 
15051 GGCTTGGGGA GGGGCTGGGG GTTGGACGGG ATTGAGGAAG GTCCGCACAG 
15101 CCTGTCTCTG CTCAGTTGCA ATAAACGTGA CATCTTGGGA GCGTTCCCCA 
15151 GAGTTTGTCT GCTTCTAGAA CCCGGGTCGC TCCTGCTGCG GTTCCAGGTT 
15201 TGGCCGCCAG AAGACGCTGC CGCCTCAGAC GAGGGCGGGC TGTGTGGGGC 
15251 GGGAGTACCA GAAAGGGTCG GCGTGTGTCC CCGGGATGCT CGCAGCTTCC 
15301 CTCTGCCCAG ACTGGGGTGG CTTTCGGCGC AATCTGTCAA GCTGTTGGAC 
15351 CTGCCGTCCC CACTCTGACC ATTGGCTGGG AAAAGTGGAT CTGGCTGATG 
15401 CTCCCAGAGC CCAGGAGCCA GGGCGGAGCG GGGCGGCGGC TGCTCCCACG 
15451 ATCCCAAGGC CGCGCACCTG CCTCCTCCCC CTCCGCCGCC GCCACTTGAG 
15501 GGATCGGGAA CAAAGGTGCT TTGTACAGGC CGCAACCACC TCATTACTTC 
15551 GTCTTAGGGA CTGGGGCCGC GTGGGCCCCC AGCCCGGAAC GAAGGTGTGG 
15601 AGCGGCAAGG GACAGACGCC AATCTTAAAG TGAGCATCTA GCGCGCCACC 
15651 TAAGGCTCTT TAGGGAAGGT GGTCCCAGAG CTGTGTTGTC CCTTCCGCTT 
15701 GCACTGTCCC TAGATGTGCA AAGAAAACGG GGCAGTGCAT GAAGGTGGTT 
15751 GGACAGGCTT CATGGATCCT CGCCCGCGCC TCACTTTCCC CTATCTGGGC 
15801 AAAGGTTATG TACCCTTATT TAAAATCTTC CAAACTTCTA ATAAGGCAGT 
15851 CTACCCTGCA CTAAAGCAGA CACGAAAGAG ATGACCTCCC TAAAAATACT 
15901 GCTGTTGGAA TACGTCCTTC CTTCCCGCCC CCTCGCAGTG CGGTGCAGCC
15951 TCAGTGGAAG CTTTGGCGAA CCTGGCGCGC GCTGCGGTGC ACAGAGGGTT 
16001 AACTGGAGTT GGCGCTGGGT GGAGAGGAGG AGACGCGCTC CCATTGGCGG 
16051 AAAGTTATTC AGGGGCGGGG TCAGTGAATC TCCGTACCCC ACTCCCCTTT 
16101 CCGCAACTTC CCTCTTCACT TTGTACCTTT CTCTCCTCGA CTGTGAAGCG
16151 GGCCGGGACC TGCCAGGCCA GACCAAACCG GACCTCGGGG GCGATGCGGC 
16201 TGCTGCCCCT GCTGCGGACT GTCCTATGGG CCGCGTCCTC GGCTCCCCTC 
16251 TGCGCGGGGG CTCCAGCCTC CGCCACGTAG TCTACTGGAA CTCCAGTAAC 
16301 CCCAGGTAGC CGGGCCGAAC CGGGCGAGCG CACAGCCAAG TCTGCGCGCT 
16351 CCCGGGCTTT GCGCGCGCCC GCCACCCGCT CTTTGCGCGG CGCCGCCTGA 
16401 GCCTGGCCGC GCGCCGGGGC TCCTTTGTTT GAGCCGGCGG GGGAGGGGGG 
16451 AGGGGCGAGG GGCGAGGCGC GCCCTGGGTC TCCCCACAGC CCGCATGTGT 
16501 TGGGGGGCAG GCAGAAGACC CCAGCCCCAA GGGTTGTCTA GGGGGTCTTG 
16551 GAGCATGGAG CTGGGGGGGC CTTTGCCCGC ACTCCGGGCT CCGCCCCCCT 
16601 CGCTGCTCTC CTGGCGATCC CCAGCCTCCC GCAGGCTGGA GCTGTGGCTG 
16651 ACGAACTTGA GAGCGAGGGA GGGGGCTTTA CTCTTATGAA AGAGCGTGGG 
16701 TTACTCTCCT GCCCGCTGGG TCTCACCTCT GGCTCTCACT CTGTCTCCTG 
16751 ATCTCATTTG CTATCTCTGC TTTCATCTCT GTCTTTATTG GTCCTTCTGT 
16801 TTCTTTCCAG TGTCAGCCCT GCCCTTCTAG CCGAATCACC TCTGGGCAAG 
16851 TCTCGTGACC TTCCTAACCT CATTTATCTC ACCTGTATAA TGGGCTAATA 
16901 ATACCTAGTA CCCTGGGAAG TCTGGCAGGG TAAGTGAGGT CATGTATGTG 
16951 AAAGAGGCTC AGGCTGTACA GATATAAACT ATTATTTCTT TCTCTCTCCT 
17001 GAGCTGCCTG CCTTTGAACC TTAGTATATT TTACTGTTTC CATCCCCCTC 
17051 CCCAAGTCTC CCTGCCTCTC CTATTTCCTA TCTGTTTTTC TTTCTGATTT 
17101 TCTACTTGAG ACAATCTGTG ACTATTCATT TCTTCACT 
(SEQ ID NO: 3) 
2522 TTAGGGTMTGGGGCCGGACGGAGACCCTGGGAGAGCCCAGCCAGAGCGCGGCCCGCCCT 
GGTCCGCTGTCCTGGGCCTAGGGCCCGGTGACTTGGCGATGGGGTGAAM 
GGGGATGCC(^GCCCCCTGCCTCCTGCCTGGT(^TCCrCTGCGCGGTCCCrGCGGACAC 
TTTCAGGCTCAGCTACCAGCTACCGAGGGGCCTCT 

AGAGGGTCGCTGCT CCCCAGGACTGGCATCTTCGCrGCTCrGGGGCCTAGCTAACCGTTC 
[C,G] 

ACCCGGTGCGVGGGCGCTGAGCGGGCATGGCTTGTAGGGTTTAGTGMGAGGATTCT 
TAGCCTCTATTCCAGGCCTGGGGCCGCCAGGCACTCCTCACCCTGGTGCT 
GTGCCTGGCCGAGCGGGAGGGGCCCGAGATGAGCCAGGAGMGGGAGMTTGGCCAGGAA 
AGAGGCTGGGACACCMCTCCTCCTTGGMCTTTCA 

GACCGAGAGGG(^GGCGCGGGTGGAGTGTCCGGAGGAGAGAGGGC(^TTGTGTGTTGGGG 
4326 GGGCCTGAGGTTTTCTGGTTAGAGAGGCTGGGAGTTGTGGACAG 

TGCCCTCTGGTGCC(^(^GACG\GTCTGCCrGAGCCCCTGAG(^T(^GrrGGAGCTGGA 

CGGTGACAGTCATATCCrGGAGCTGCTA(^GMTAGGTMTAGTGATGGTGGCMTM^ 

GTGACCACATGGCCA^CMCTTGTATAGCATTTATTATGTGCCAG 

GCTCATTTMTCCTCATMCAGCCCTATMGGGA^ 

[C,T] 

TTTATACATGAGGMGTCAAGGGACAGAGAG^ 

TAGTATGTGGTGAAMCG^GATTGGAATTCAAATAAACrMCAGAGT(^GTGGCCCAACC 

AGTATACTTTGCTGCCCCGGGGTCAGGAGTGGAAAAGTTGGCTGCGGGGG 

CCCAGCCCCACAACCACCTTQ\A^ 

TTTAGCACTGCLI I LI I I I IL I I CALTTCACAGGGGAG7TGGTCCCATGTCCGCCCCGAC 

5954 AGGTGGTTTGCCCGAGGCCCTACMCTMTAMTGGCCT^^ 

GGCrCrrCATCTGTCTTATGATCCCATTTGCAGAGAGCrCTC^ 
CATAGTTACCAATGATGMGCMTATAMCCCAATTTC 
TAAMCTACrrGCTGCATAGAGTTGCTGGGMGATTAAATAAGTCCA^ 
TGC1TAAAACTATGCCAGACCTATGGTAAGTGACAAGAGI IGI IATTGGGAI I I I IAAAA 
[T,-] 

TATTATTATTATTATTATTATTATTTGAGACA^ 

TGCAGTGGCGTGATCTCGGCTCACTGCMGCTC^ 

GCCTCAGCCTCCCGT^GTAGCTGGGACTACAGGCGCCCGCCACTACACCCG^ 

TGTAI I I I I I AGTAG\GACAGGGTTTCACCGTGTTATCCAGGATGGTCTCG7\TCTCCTGA 

CCTCATGATCCACCCGCCTTGTCCTCCCAMGTGCTGAGATT^ 

6783 TGCGAGCGGCAGGTTTGCCTGATAATTC I I LI I GTCCATAGTAGAGGCGGGATGTGGTAA 

CAGAGACCMGACTGTGGAGTTGGTGATTGTGGCTGATC 
CCCCTGCACATCCTCCTCCCCCTGCACT^^ 
CTACAGGCCCAGAMTACCGGGACTTCCAGCACLTGCTAMCCGCACT^ 
CTCTTGCTGGACACAGTGAGTGCTGGACAGQGCMCCCCCACCCC^ 
[G,A] 

CAACCCCTCTTCTGAGCCCCAGCTGT CTTTCAGTT CTTCCGGCCCCTGAATGTACGAGTG 

GCACTAGTGGGCCTGGAGGCCTGGACCCAGCGTGACCTGGTGGAGATCAGCCC 

GCTGTCACCCTCGAAMCTTCCTCCACTGGCGCAGGGCACATTTOT 

CATGACAGTGCCCAGCTGGTGACGTMGGGCCCCAGACTCAGCCAGAGAGGC 

TCCTGGCCAMTT(^CACCCCTT<^GCACCCTAC(TCAGCCCCTG^ 

7514 TATTTCCAGGTCTCCTCCTCATTCCCAATTCAGTTCCTCCC 

CACTGAMCCCCCCTATAMGTTGCCCMCCCCAMGCrACAGGTATAGAGGCT 
ACGTGATGTGGCCTTTGCTATCAGGGAGCCCTCGCrTATGGCCAG 
ACAGTCATCCCCTGTGCAGTCTTCCCA 1 1 I LI I AGAGGAGGGTAGGAGGCAGCTAAGGCC 
CAMGMCAGAGGTGATCTCCCTCCAGTGAGGGAGGGGGACAGAGCTGAGCTAGM 
[A,C] 

GTTTCTGCCATCCAGGCCTGGGTTCTCCTACTTTAGMGCA^ 

CCTGCTGAGTGCCCACGAGGTCAGACGTGGAGGGMCAGGAGCAGAGAGGGTOT 

CATTGTGGTGGAGGCAGGCTGGGACTGGACCTACAGTACCCCTCCCCM 
CTCCACCAGG^TCCTGGGACT 

CCTGGACCATGATTTC^CTGGGAATAGCTGCCCCrGTCCAGGrCCACXICCCAGCC^AGAC 

15505 CGCCAGAAGACGCTGCCGCCTCAGACGAGGGCGGGCTGTGTGGGGCGGGAGTACCAGAAA 
GGGTCGGCGTGTGTCCCCGGGATGCTCGCAGCTTCCCTCTGCCCAGACTGGGGTGGCTTT 
CGGCGCAATCTGTCAAGCTGTTGGACCTGCCGTCCCCACTCTGACCATTGGCTGGGAAAA 
GTGGATCTGGCTGATGCTCCCAGAGCCCAGGAGCCAGGGCGGAGCGGGGCGGCGGCTGCT 
CCCACGATCCCAAGGCCGCGCACCTGCCTCCTCCCCCTCCGCCGCCGCCACTTGAGGGAT 
[C,T] 

GGGAACAAAGGTGCTTTGTACAGGCCGCAACCACCTCATTACTTCGTCTTAGGGACTGGG 

GCCGCGTGGGCCCCCAGCCCGGAACGAAGGTGTGGAGCGGCAAGGGACAGACGCCAATCT 

TAAAGTGAGCATCTAGCGCGCCACCTAAGGCTCTTTAGGGAAGGTGGTCCCAGAGCTGTG 

TTGTCCCTTCCGCTTGCACTGTCCCTAGATGTGCAAAGAAAACGGGGCAGTGCATGAAGG 

TGGTTGGACAGGCTTCATGGATCCTCGCCCGCGCCTCACTTTCCCCTATCTGGGCAAAGG 

16123 AAATCTTCCAAACTTCTAATAAGGCAGTCTACCCTGCACTAAAGCAGACACGAAAGAGAT 

GACCTCCCTAAAAATACTGCTGTTGGAATACGTCCTTCCTTCCCGCCCCCTCGCAGTGCG 

GTGCAGCCTCAGTGGAAGCTTTGGCGAACCTGGCGCGCGCTGCGGTGCACAGAGGGTTAA 

CTGGAGTTGGCGCTGGGTGGAGAGGAGGAGACGCGCTCCCATTGGCGGAAAGTTATTCAG 

GGGCGGGGTCAGTGAATCTCCGTACCCCACTCCCCTTTCCGCAACTTCCCTCTTCACTTT 

[A,G] 

TACCTTTCTCTCCTCGACTGTGAAGCGGGCCGGGACCTGCCAGGCCAGACCAAACCGGAC 

CTCGGGGGCGATGCGGCTGCTGCCCCTGCTGCGGACTGTCCTATGGGCCGCGTCCTCGGC 

TCCCCTCTGCGCGGGGGCTCCAGCCTCCGCCACGTAGTCTACTGGAACTCCAGTAACCCC 

AGGTAGCCGGGCCGAACCGGGCGAGCGCACAGCCAAGTCTGCGCGCTCCCGGGCTTTGCG 

CGCGCCCGCCACCCGCTCTTTGCGCGGCGCCGCCTGAGCCTGGCCGCGCGCCGGGGCTCC 

Chromosome map: 
Chromosome # 1 